Skip to content

Conversation

@ilopezluna
Copy link
Contributor

@ilopezluna ilopezluna commented Dec 15, 2025

In this PR, I’m fixing the package command by moving the normalization of the model name into the model distribution. This centralizes the normalization logic in a single place, making it less error-prone and easier to understand.

I’ve also updated the model filtering used by the list command so that it matches the behavior of the Docker CLI.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant architectural improvement by centralizing all model name normalization and ID resolution logic within the pkg/distribution client. This refactoring resolves existing inconsistencies in how model references are interpreted across different parts of the application, most notably fixing a critical bug in the package command. By consolidating this logic, the PR enhances the robustness, predictability, and maintainability of model operations throughout the CLI and various API endpoints, ensuring a more consistent user experience.

Highlights

  • Centralized Model Name Normalization: All logic for normalizing model names and resolving model IDs (e.g., adding default organizations/tags, handling Hugging Face names, resolving short IDs) has been consolidated into a new private normalizeModelName method within the pkg/distribution/distribution/client.go. This ensures consistent handling of model references across the application.
  • Fix for Package Command: This pull request directly addresses and fixes issue Packaging a GGUF model fails with 404 “model not found” when using docker model package #461, where the package command would fail with a 'model not found' error during the tagging process. Comprehensive integration tests have been added for the package command to cover various scenarios, including packaging GGUF files with different tags, context sizes, and custom organizations.
  • Refactored packageModel Function: The packageModel function in cmd/cli/commands/package.go has been refactored to explicitly accept context.Context and desktop.Client as arguments. This change improves the function's testability and makes its dependencies clearer.
  • Simplified Model Handling in CLI and API: Higher-level CLI commands (like list and package) and various API handlers (in desktop, inference/models, and ollama) no longer perform their own model name normalization or ID expansion. They now rely entirely on the centralized logic provided by the distribution client, streamlining code and reducing potential inconsistencies.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

t.Logf("✓ Verified original source %s still exists", tc.sourceRef)
})
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this verification because it's redundant. If a tag fails the individual test will fail, so this test is not adding any additional coverage and it becomes a false positive in some cases (for example when tagging a model using same reference as tag, its a no op, but this test will fail because the expected number of tagged models will not match with the actual)

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors model name normalization and ID resolution by centralizing this logic within the pkg/distribution/distribution/client.go package. Previously, normalization was scattered across cmd/cli/desktop, pkg/inference/models, and pkg/ollama packages. The changes introduce a new private normalizeModelName method and helper functions (looksLikeID, looksLikeDigest, resolveID) in the distribution.Client to handle adding default organizations/tags, converting Hugging Face model names, and resolving short IDs or digests to full model IDs. Consequently, all calls to NormalizeModelName from pkg/inference/models and pkg/ollama have been removed, and the desktop client no longer performs its own model ID expansion. Integration tests for model packaging were added to address issue #461, and existing test cases related to normalization were either updated or removed. Review comments suggest enhancing the looksLikeID and looksLikeDigest functions to be case-insensitive for hexadecimal characters and converting the input id to lowercase in resolveID for more robust ID matching.

Comment on lines 199 to 211
func (c *Client) looksLikeID(s string) bool {
n := len(s)
if n != 12 && n != 64 {
return false
}
for i := 0; i < n; i++ {
ch := s[i]
if !((ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f')) {
return false
}
}
return true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The looksLikeID function currently only validates lowercase hexadecimal characters. Docker IDs are case-insensitive, so this check should be updated to allow uppercase characters as well to make ID recognition more robust. The corresponding test case in TestLooksLikeID should also be updated to expect true for uppercase hex input.

Suggested change
func (c *Client) looksLikeID(s string) bool {
n := len(s)
if n != 12 && n != 64 {
return false
}
for i := 0; i < n; i++ {
ch := s[i]
if !((ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f')) {
return false
}
}
return true
}
func (c *Client) looksLikeID(s string) bool {
n := len(s)
if n != 12 && n != 64 {
return false
}
for i := 0; i < n; i++ {
ch := s[i]
if !((ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f') || (ch >= 'A' && ch <= 'F')) {
return false
}
}
return true
}

Comment on lines 214 to 231
func (c *Client) looksLikeDigest(s string) bool {
const prefix = "sha256:"
if !strings.HasPrefix(s, prefix) {
return false
}
hashPart := s[len(prefix):]
// SHA256 digests must be exactly 64 hex characters
if len(hashPart) != 64 {
return false
}
for i := 0; i < 64; i++ {
ch := hashPart[i]
if !((ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f')) {
return false
}
}
return true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The looksLikeDigest function should handle case-insensitivity for both the sha256: prefix and the hexadecimal digest itself. This makes digest recognition more robust against user input variations like SHA256:... or digests with uppercase hex characters. The corresponding test case in TestLooksLikeDigest should also be updated to expect true for uppercase hex input.

Suggested change
func (c *Client) looksLikeDigest(s string) bool {
const prefix = "sha256:"
if !strings.HasPrefix(s, prefix) {
return false
}
hashPart := s[len(prefix):]
// SHA256 digests must be exactly 64 hex characters
if len(hashPart) != 64 {
return false
}
for i := 0; i < 64; i++ {
ch := hashPart[i]
if !((ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f')) {
return false
}
}
return true
}
func (c *Client) looksLikeDigest(s string) bool {
const prefix = "sha256:"
if !strings.HasPrefix(strings.ToLower(s), prefix) {
return false
}
hashPart := s[len(prefix):]
// SHA256 digests must be exactly 64 hex characters
if len(hashPart) != 64 {
return false
}
for i := 0; i < 64; i++ {
ch := hashPart[i]
if !((ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f') || (ch >= 'A' && ch <= 'F')) {
return false
}
}
return true
}

Comment on lines +235 to +236
func (c *Client) resolveID(id string) string {
models, err := c.ListModels()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To complement the case-insensitive checks in looksLikeID and looksLikeDigest, the id parameter should be converted to lowercase before comparison. Stored model IDs are lowercase, so this ensures that IDs provided with uppercase characters can be correctly resolved.

func (c *Client) resolveID(id string) string {
	id = strings.ToLower(id)
	models, err := c.ListModels()

if modelFilter != "" {
// Normalize the filter to match stored model names (backend normalizes when storing)
normalizedFilter := dmrm.NormalizeModelName(modelFilter)
// If filter doesn't contain '/', prepend default namespace 'ai/'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the model filter logic to match the same logic as docker cli you can filter by repo + tag but not only by tag

@ilopezluna ilopezluna changed the title [WIP] Fix package command Fix package command Dec 15, 2025
@ilopezluna ilopezluna requested a review from a team December 15, 2025 16:37
@ilopezluna ilopezluna marked this pull request as ready for review December 15, 2025 16:37
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • The new normalizeModelName does a ListModels() call whenever an input looks like an ID or digest, which will now run on many paths (PullModel, GetModel, DeleteModel, Tag, PushModel, GetBundle); consider scoping ID resolution to only the operations that actually need short-ID support or adding caching to avoid a full store scan on every call.
  • The updated listModels filter logic now trims non-matching tags from each model instead of returning the full tag list for matching models; double-check that this narrower behavior is intended and that any callers displaying or relying on all tags still behave as expected.
  • Client-side model-ID expansion and HuggingFace normalization were removed from the desktop client (Inspect, InspectOpenAI, Chat, Remove, Tag); ensure the server-side normalization now fully covers previous UX patterns (short IDs, bare names, HF aliases) or explicitly document any remaining differences in how users can reference models.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new `normalizeModelName` does a `ListModels()` call whenever an input looks like an ID or digest, which will now run on many paths (PullModel, GetModel, DeleteModel, Tag, PushModel, GetBundle); consider scoping ID resolution to only the operations that actually need short-ID support or adding caching to avoid a full store scan on every call.
- The updated `listModels` filter logic now trims non-matching tags from each model instead of returning the full tag list for matching models; double-check that this narrower behavior is intended and that any callers displaying or relying on all tags still behave as expected.
- Client-side model-ID expansion and HuggingFace normalization were removed from the desktop client (`Inspect`, `InspectOpenAI`, `Chat`, `Remove`, `Tag`); ensure the server-side normalization now fully covers previous UX patterns (short IDs, bare names, HF aliases) or explicitly document any remaining differences in how users can reference models.

## Individual Comments

### Comment 1
<location> `pkg/distribution/distribution/client.go:144-153` </location>
<code_context>
+func (c *Client) normalizeModelName(model string) string {
</code_context>

<issue_to_address>
**issue (bug_risk):** Digest references containing '@' can be mis-normalized and treated as tags

For inputs like `foo@sha256:abcd...`, `SplitN(model, ":", 2)` yields `nameWithOrg="foo@sha256"` and `tag="abcd..."`, then (because there's no slash) the default org is prepended, producing `ai/foo@sha256:abcd...`. This incorrectly turns a digest reference (`repo@digest`) into something that looks like a tag (`repo@sha256:tag`), which can break pulls and deletes that rely on digest semantics. Please explicitly detect `@` and either skip org/tag normalization for digest references or handle them in a separate branch before the `SplitN` logic.
</issue_to_address>

### Comment 2
<location> `cmd/cli/commands/integration_test.go:993-1002` </location>
<code_context>
+// TestIntegration_PackageModel tests packaging a GGUF model file
</code_context>

<issue_to_address>
**suggestion (testing):** Tighten assertions and clarify model counting/cleanup in TestIntegration_PackageModel

Since `listModels` returns a textual representation, `len(models)` is counting characters rather than models, which makes the initial "no models exist" assertion misleading. Instead, either parse the output (if practical) and assert on the number of entries, or assert directly on the raw output, e.g. `require.Equal(t, "", strings.TrimSpace(models))`.

For consistency and clarity with the final cleanup check (which already uses `require.Empty(strings.TrimSpace(models))`), consider using the same pattern in the initial assertion so the test more clearly states what it is verifying.

Suggested implementation:

```golang
	// Ensure no models exist initially (no models listed in textual output)
	models, err := listModels(false, env.client, true, false, "")
	require.NoError(t, err)
	require.Empty(t, strings.TrimSpace(models), "Expected no initial models, but found some")

```

If `strings` is not yet imported in `cmd/cli/commands/integration_test.go`, add it to the import block:

<<<<<<< SEARCH
import (
=======
import (
	"strings"
>>>>>>> REPLACE

Make sure it integrates with the existing import list (respecting any gofmt-organized grouping).
</issue_to_address>

### Comment 3
<location> `cmd/cli/commands/utils_test.go:11` </location>
<code_context>
-	"github.com/docker/model-runner/pkg/inference/models"
 )

-func TestNormalizeModelName(t *testing.T) {
-	tests := []struct {
-		name     string
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for the new list filter behavior to replace the removed NormalizeModelName tests

The previous `models.NormalizeModelName` tests were removed, and while normalization is now covered in `normalize_test.go`, the new list filtering semantics in `list.go` (implicit `ai/` prefix, tag vs repository-only filtering, trimming tags on returned models) don’t appear to be tested.

Please add unit tests that:
- exercise the list logic with a synthetic model list and filters like `"smollm2"`, `"smollm2:latest"`, `"ai/smollm2"`, and HF models, and
- assert both the returned models and their remaining tags.

This helps ensure we stay aligned with Docker CLI semantics.

Suggested implementation:

```golang
	"errors"
	"fmt"
	"testing"
)

type listTestModel struct {
	Name string
	Tags []string
}

// filterModelsForTest replicates the desired list filtering semantics:
//   - implicit "ai/" prefix when no namespace is provided
//   - explicit names like "ai/..." or "hf/..." are honored as-is
//   - "name:tag" form filters by a specific tag
//   - when filtering by tag, only the matching tag is kept on the returned model
func filterModelsForTest(models []listTestModel, ref string) []listTestModel {
	repo, tag := splitRefForTest(ref)

	// Apply implicit "ai/" prefix if no namespace is provided
	if !hasNamespace(repo) {
		repo = "ai/" + repo
	}

	var filtered []listTestModel
	for _, m := range models {
		if m.Name != repo {
			continue
		}

		// If no tag specified, keep all tags
		if tag == "" {
			filtered = append(filtered, m)
			continue
		}

		// Filter tags on the model
		var kept []string
		for _, t := range m.Tags {
			if t == tag {
				kept = append(kept, t)
			}
		}

		if len(kept) > 0 {
			filtered = append(filtered, listTestModel{
				Name: m.Name,
				Tags: kept,
			})
		}
	}

	return filtered
}

func splitRefForTest(ref string) (repo string, tag string) {
	for i := len(ref) - 1; i >= 0; i-- {
		if ref[i] == ':' {
			return ref[:i], ref[i+1:]
		}
	}
	return ref, ""
}

func hasNamespace(ref string) bool {
	for i := 0; i < len(ref); i++ {
		if ref[i] == '/' {
			return true
		}
	}
	return false
}

func TestListFilter_ImplicitAIPrefixAndExplicitHF(t *testing.T) {
	models := []listTestModel{
		{Name: "ai/smollm2", Tags: []string{"latest", "1.0"}},
		{Name: "hf/smollm2", Tags: []string{"latest"}},
		{Name: "ai/other", Tags: []string{"latest"}},
	}

	t.Run("implicit ai/ prefix prefers ai/ namespace", func(t *testing.T) {
		got := filterModelsForTest(models, "smollm2")
		if len(got) != 1 {
			t.Fatalf("expected 1 model, got %d", len(got))
		}
		if got[0].Name != "ai/smollm2" {
			t.Fatalf("expected model name %q, got %q", "ai/smollm2", got[0].Name)
		}
		// Repository-only filter keeps all tags
		if len(got[0].Tags) != 2 {
			t.Fatalf("expected 2 tags, got %d (%v)", len(got[0].Tags), got[0].Tags)
		}
	})

	t.Run("explicit ai/ prefix behaves like implicit", func(t *testing.T) {
		got := filterModelsForTest(models, "ai/smollm2")
		if len(got) != 1 {
			t.Fatalf("expected 1 model, got %d", len(got))
		}
		if got[0].Name != "ai/smollm2" {
			t.Fatalf("expected model name %q, got %q", "ai/smollm2", got[0].Name)
		}
	})

	t.Run("explicit hf/ prefix only matches hf models", func(t *testing.T) {
		got := filterModelsForTest(models, "hf/smollm2")
		if len(got) != 1 {
			t.Fatalf("expected 1 model, got %d", len(got))
		}
		if got[0].Name != "hf/smollm2" {
			t.Fatalf("expected model name %q, got %q", "hf/smollm2", got[0].Name)
		}
	})
}

func TestListFilter_TagVsRepositoryFilteringAndTagTrimming(t *testing.T) {
	models := []listTestModel{
		{Name: "ai/smollm2", Tags: []string{"latest", "1.0"}},
		{Name: "ai/smollm2-base", Tags: []string{"latest"}},
	}

	t.Run("repository-only filter returns all tags", func(t *testing.T) {
		got := filterModelsForTest(models, "smollm2")
		if len(got) != 1 {
			t.Fatalf("expected 1 model, got %d", len(got))
		}
		if got[0].Name != "ai/smollm2" {
			t.Fatalf("expected model name %q, got %q", "ai/smollm2", got[0].Name)
		}
		if len(got[0].Tags) != 2 {
			t.Fatalf("expected 2 tags, got %d (%v)", len(got[0].Tags), got[0].Tags)
		}
	})

	t.Run("name:tag filter returns only that tag", func(t *testing.T) {
		got := filterModelsForTest(models, "smollm2:latest")
		if len(got) != 1 {
			t.Fatalf("expected 1 model, got %d", len(got))
		}
		if got[0].Name != "ai/smollm2" {
			t.Fatalf("expected model name %q, got %q", "ai/smollm2", got[0].Name)
		}
		if len(got[0].Tags) != 1 {
			t.Fatalf("expected 1 tag, got %d (%v)", len(got[0].Tags), got[0].Tags)
		}
		if got[0].Tags[0] != "latest" {
			t.Fatalf("expected tag %q, got %q", "latest", got[0].Tags[0])
		}
	})

	t.Run("name:tag filter does not match non-existing tag", func(t *testing.T) {
		got := filterModelsForTest(models, "smollm2:nonexistent")
		if len(got) != 0 {
			t.Fatalf("expected 0 models, got %d (%v)", len(got), got)
		}
	})
}


```

The tests above encode the expected list filtering semantics (implicit `ai/` prefix, distinction between tag vs repository-only filters, and tag trimming on the returned models) using a local `filterModelsForTest` helper. To ensure these tests cover the real implementation in `list.go` instead of a duplicated helper, you should:

1. Extract the core filtering logic from `list.go` into a small, testable function (e.g. `filterModels(...)`) in a non-internal package or file that can be imported from tests, or
2. If the filtering function already exists (with a different signature/name), update these tests to call that function directly and adjust the `listTestModel` type to match the real model type.

Once wired to the real implementation, the assertions in these tests will guarantee that the list behavior stays aligned with the intended Docker CLI semantics.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@doringeman
Copy link
Contributor

I'll take a closer look at while the lint fails on the CI, but not locally. Feel free to merge.

@ilopezluna ilopezluna merged commit 2c53258 into main Dec 17, 2025
13 of 25 checks passed
@ilopezluna ilopezluna deleted the fix-package-model branch December 17, 2025 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants